Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games

Authors

  • H. L. Prasad
  • Prashanth L. A.
  • Shalabh Bhatnagar
Abstract

We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from [9] to a general N-player game setting. Next, we break down the optimization problem into simpler sub-problems that ensure there is no Bellman error for a given state and agent. We then characterize the solution points of these sub-problems that correspond to Nash equilibria of the underlying game, and for this purpose we derive a set of necessary and sufficient SG-SP (Stochastic Game Sub-Problem) conditions. Using these conditions, we develop two provably convergent algorithms. The first algorithm, OFF-SGSP, is centralized and model-based, i.e., it assumes complete information of the game. The second algorithm, ON-SGSP, is an online model-free algorithm. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose stable limit points coincide with stationary NE of the underlying general-sum stochastic game. On a single-state non-generic game [12] as well as on a synthetic two-player game setup with 810,000 states, we show that ON-SGSP consistently outperforms the NashQ [16] and FFQ [21] algorithms.
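To make the phrase "no Bellman error for a given state and agent" concrete, here is a minimal Python sketch for a two-player game. It is not OFF-SGSP or ON-SGSP and does not reproduce the SG-SP conditions. It only evaluates a fixed stationary joint policy and reports, per agent and state, how much that agent could gain by a one-step best response. This gap is zero everywhere exactly when the policy is a stationary NE. The function name bellman_errors, the array layouts of P, R and pi, and the prisoner's-dilemma payoffs are all assumptions made for illustration.

```python
import numpy as np


def bellman_errors(P, R, pi, gamma):
    """Per-agent, per-state Bellman error of a stationary joint policy.

    Hypothetical layout (not from the paper):
      P[s, a1, a2, t] : probability of moving from state s to t
      R[i, s, a1, a2] : reward of agent i
      pi[i][s, a]     : probability that agent i plays a in state s
    """
    n_states = P.shape[0]
    # Joint action distribution of the product policy: joint[s, a1, a2]
    joint = pi[0][:, :, None] * pi[1][:, None, :]
    # State transition matrix induced by the joint policy
    P_pi = np.einsum("sab,sabt->st", joint, P)

    errors = np.zeros((2, n_states))
    for i in range(2):
        # Expected one-step reward of agent i under the joint policy
        r_pi = np.einsum("sab,sab->s", joint, R[i])
        # Policy evaluation: v = r_pi + gamma * P_pi v
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Action values for every joint action
        q = R[i] + gamma * np.einsum("sabt,t->sab", P, v)
        # Average over the other agent's action, keep agent i's own action
        if i == 0:
            q_own = np.einsum("sab,sb->sa", q, pi[1])
        else:
            q_own = np.einsum("sab,sa->sb", q, pi[0])
        # Gain from a one-step best response; zero means no Bellman error
        errors[i] = q_own.max(axis=1) - v
    return errors


# Single-state prisoner's dilemma (assumed payoffs): action 0 = cooperate,
# action 1 = defect.
P = np.ones((1, 2, 2, 1))
R = np.array([
    [[[3.0, 0.0], [5.0, 1.0]]],  # agent 0
    [[[3.0, 5.0], [0.0, 1.0]]],  # agent 1
])
defect = [np.array([[0.0, 1.0]]), np.array([[0.0, 1.0]])]
cooperate = [np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]])]

errors_defect = bellman_errors(P, R, defect, gamma=0.9)
errors_coop = bellman_errors(P, R, cooperate, gamma=0.9)
print(errors_defect)
print(errors_coop)
```

In this toy game, mutual defection is an equilibrium and gives errors of approximately zero for both agents. Mutual cooperation leaves each agent a positive gain from deviating, so its Bellman error is non-zero.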


Similar resources

Algorithms for Nash Equilibria in General-Sum Stochastic Games

Over the past few decades the quest for algorithms to compute Nash equilibria in general-sum stochastic games has intensified and several important algorithms (cf. [9], [12], [16], [7]) have been proposed. However, they either lack generality, are intractable for even medium-sized problems, or both. In this paper, we first formulate a non-linear optimization problem for stochast...


Robust Learning for Repeated Stochastic Games via Meta-Gaming

This paper addresses learning in repeated stochastic games (RSGs) played against unknown associates. Learning in RSGs is extremely challenging due to their inherently large strategy spaces. Furthermore, these games typically have multiple (often infinite) equilibria, making attempts to solve them via equilibrium analysis and rationality assumptions wholly insufficient. As such, previous learnin...


A Study of Gradient Descent Schemes for General-Sum Stochastic Games

Zero-sum stochastic games are easy to solve as they can be cast as simple Markov decision processes. This is however not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games by Filar and Vrieze [2004]. However, the optimization problem there has a non-linear objective and non-linear constraints with special s...


Learning with Partial Observations in General-sum Stochastic Games

In many situations, multiagent systems must deal with agents' partial observability of the environment. In these cases, finding optimal solutions is often intractable for more than two agents, and approximate solutions are often the only way to solve these problems. The model known to represent this kind of problem is the Partially Observable Stochastic Game (POSG). Such a model is usuall...


Stochastic Learning of Equilibria in Games: The Ordinary Differential Equation Method

Our purpose is to discuss stochastic algorithms to learn equilibria in games, and their time of convergence. To do so, we consider a general class of stochastic algorithms that converge weakly (in the sense of weak convergence for stochastic processes) towards solutions of particular ordinary differential equations, corresponding to their mean-field approximations. Tuning parameters in these al...




Publication date: 2015